Replication Package: Lyall and Zhukov, "Fratricidal Coercion in Modern War"
---------------------------------------------------------------------

Updated 10/02/2024.

# Table of Contents
1. [Overview](#overview)
2. [Data Sources](#data)
3. [System Requirements](#system)
4. [Instructions](#instructions)
5. [Mapping of Code to Figures and Tables](#mapping)


# Overview <a name="overview"></a>

The code in this replication package reconstructs the empirical analyses in Lyall and Zhukov, "Fratricidal Coercion in Modern War". A single R script, `master.R`, runs all of the code needed to generate the figures and tables in the main body of the paper and all empirical results in the Online Appendix. The replicator should expect the code to run for about 3-4 hours, depending on system configuration.

## Statement About Rights

- [x] We certify that the authors of the manuscript have legitimate access to and permission to use the data used in this manuscript.
- [x] We certify that the authors of the manuscript have documented permission to redistribute/publish the data contained within this replication package. Appropriate permissions are documented in the `LICENSE.txt` file.

These data are licensed under a CC-BY-NC-SA license (Creative Commons-Attribution-NonCommercial-ShareAlike). See `LICENSE.txt` for details.

## Contents of Replication Package

- `README.md`: this file
- `code/`: directory with R scripts sourced by `master.R`
    - `master.R`: R script that executes all replication code in sequence
    - `run1_setup.R`: R script to install missing packages and dependencies
    - `run2_main.R`: R script to replicate all figures and tables in main text and online appendices
    - `functions.R`: R script with custom functions sourced by `run2_main.R` 
- `data/`: directory with data files used by `run2_main.R` 
    - `data_divisionlevel.RDS`: unit-level RKKA data
    - `data_battlelevel.RDS`: cross-national battle-level data
    - `data_warlevel.RDS`: cross-national war-level data
    - `front_ref.RDS`: front-level reference table
- `results/`: directory with files generated by `run2_main.R` (you may need to create this directory)
    - `fig*.png`: figures in PNG format 
    - `tab*.tex`: tables in LaTeX format 
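
Before running the scripts, it can help to confirm that the layout above is in place. The following is a minimal sketch, assuming the paths match the listing above; `check_layout()` is a hypothetical helper and is not part of the replication package.

```r
# Hypothetical helper (not part of the replication package): reports which of
# the files listed above are absent and creates `results/` if it is missing.
check_layout <- function(root = ".") {
  expected <- c(
    "code/master.R", "code/run1_setup.R", "code/run2_main.R", "code/functions.R",
    "data/data_divisionlevel.RDS", "data/data_battlelevel.RDS",
    "data/data_warlevel.RDS", "data/front_ref.RDS"
  )
  missing <- expected[!file.exists(file.path(root, expected))]
  # Create the output directory if it does not exist yet
  dir.create(file.path(root, "results"), showWarnings = FALSE)
  missing  # character(0) when every listed file is present
}
```

Calling `check_layout()` from the package root returns the relative paths of any files that could not be found.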


# Data Sources <a name="data"></a>

## 1. Red Army Rifle Divisions in World War II

Data on **monthly orders-of-battle** are from Fes’kov, Kalashnikov and Golikov (2003). These orders-of-battle are used to establish the population of Soviet military units for our aggregate unit-month panel data.

Citation info:
- Fes’kov, V.I., K.A. Kalashnikov, and I.F. Golikov. 2003. *Krasnaya Armiya v pobedakh i porazheniyakh 1941-1945 gg. [Red Army in victory and defeat 1941-1945].* Tomsk: Tomsk University Press.

Data on **soldiers' service in the Red Army**, their unit assignments and battle-level outcomes during the Great Patriotic War are from the Memory of the People (Pamyat' Naroda) archive, maintained by the Russian Ministry of Defense, and pre-processed by Rozenas, Talibova and Zhukov (2024). The files provided with the replication package include aggregates of these records grouped by military units and months. The original data can be downloaded from https://pamyat-naroda.ru/. 

Citation info:
- Ministry of Defense of the Russian Federation. *People’s Memory (Pamyat’ Naroda)*, 2020. url: https://pamyat-naroda.ru/.
- Rozenas, Arturas, Roya Talibova and Yuri M. Zhukov. 2024. "Fighting for Tyranny: State Repression and Combat Motivation." *American Economic Journal: Applied Economics* 16(3): 44-75. url: https://doi.org/10.24433/CO.2039195.v1

Data on **NKVD officers' assignment to military units** are from the "Cadres of State Security Organs of USSR" archive, maintained by the Russian human rights organization Memorial. The files provided with the replication package include the aggregate number of NKVD SMERSH/OO officers present in each unit-month. The original data can be downloaded from https://nkvd.memo.ru/.

Citation info:
- Memorial. *Kadrovyy sostav organov gosudarstvennoy bezopasnosti SSSR. 1935-1939 [Cadres of state security organs of USSR, 1935-1939]*, 2017. url: https://nkvd.memo.ru/.

Data on **soldiers' exposure to repression** are from the Victims of Political Terror archive, maintained by the Russian human rights organization Memorial. The files provided with the replication package include unit-level sums of arrests within a discrete distance of soldiers' birth locations. The original data can be downloaded from https://base.memo.ru/. 

Citation info:
- Memorial. *Zhertvy politicheskogo terrora v SSSR [Victims of political terror in the USSR]*, 2014. url: http://lists.memo.ru/.

Data on **pre-war officer purges in each unit** are from Churbakov (2004)'s database of 3288 repressed Soviet officers. The files provided with the replication package include unit-level sums of purged officers. The original data can be downloaded from http://www.rkka.ru/handbook/personal/repress/main.htm. 

Citation info:
- Churbakov, Dmitriy. *Repressirovannyye voennosluzhashchiye Krasnoy Armii [Repressed Red Army Personnel]*. 2004. url: http://www.rkka.ru/handbook/personal/repress/main.htm

Data objects provided as part of this archive:
- `data/data_divisionlevel.RDS`
- `data/front_ref.RDS`


## 2. Cross-national data

Data on **battle-level outcomes** are from Lehmann and Zhukov (2019)'s cross-national battle-level dataset. The original data can be downloaded from https://doi.org/10.1017/S0020818318000358. 

Citation info:
- Lehmann, Todd C. and Yuri M. Zhukov. 2019. "Until the Bitter End? The Diffusion of Surrender Across Battles." *International Organization* 73(1): 133–169.

Cross-national data on the use of **blocking detachments** are from Lyall (2020)'s Project Mars dataset. The original data can be downloaded from https://doi.org/10.7910/DVN/DUO7IE. 

Citation info:
- Lyall, Jason. 2020. *Divided Armies: Inequality and Battlefield Performance in Modern War.* Princeton University Press.

Data on **war-level outcomes** are from Correlates of War 4.0. The original data can be downloaded from https://www.correlatesofwar.org/. 

Citation info:
- Singer, J. David and Melvin Small. 2010. *Correlates of War. Inter-State War Data: Version 4.0.* url: https://www.correlatesofwar.org/

Data objects provided as part of this archive:
- `data/data_battlelevel.RDS`
- `data/data_warlevel.RDS`




# System Requirements <a name="system"></a>

## 1. Software Requirements

Portions of the code rely on forking to run in parallel on multicore systems, which requires Linux or macOS. Single-core processing is possible on Windows by setting `mc.cores = 1` in calls to `parallel::mclapply`.
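
One way to pick a core count that works on every platform is sketched below. This is an illustration only: `n_cores` and the toy workload are assumptions, not names or computations taken from the package scripts.

```r
library(parallel)

# detectCores() can return NA on some systems, so fall back to 1 in that case
available <- detectCores()
if (is.na(available)) available <- 1L

# Forking is unavailable on Windows, so use a single core there.
# `n_cores` is an illustrative name, not a variable from the package scripts.
n_cores <- if (.Platform$OS.type == "windows") 1L else max(1L, available - 1L)

# Toy workload standing in for the package's actual computations
squares <- mclapply(1:4, function(i) i^2, mc.cores = n_cores)
```

Leaving one core free is a convenience choice rather than a requirement; any value of `mc.cores` of at least 1 is valid on Linux and macOS.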

R version 4.4.1 (2024-06-14)
- Platform: `x86_64-pc-linux-gnu` (64-bit)
- Running under: Ubuntu 22.04.5 LTS
- The script `run1_setup.R` will install all missing packages and dependencies, and should be run prior to executing `run2_main.R`
- Attached packages:
    - `data.table` (1.15.4)
    - `dplyr` (1.1.4)
    - `fixest` (0.12.0)
    - `Formula` (1.2-5)
    - `ggplot2` (3.5.1)
    - `lfe` (3.0-0)
    - `lme4` (1.1-35.3)
    - `MASS` (7.3-61)
    - `Matching` (4.10-14)
    - `Matrix` (1.6-5)
    - `nlme` (3.1-165)
    - `optimx` (2023-10.21)
    - `robustlmm` (3.3-1)
    - `stringr` (1.5.1)
    - `SUNGEO` (1.3.0)
    - `tidyselect` (1.2.1)
    - `xtable` (1.8-4)
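
To see how a local installation compares with the versions listed above, a check along the following lines may be useful. It is a minimal sketch: `check_versions()` is a hypothetical helper, and only a subset of the package list is shown.

```r
# Hypothetical helper (not part of the replication package): compares installed
# package versions against a named character vector of expected versions.
check_versions <- function(expected) {
  status <- vapply(names(expected), function(pkg) {
    if (!requireNamespace(pkg, quietly = TRUE)) return("missing")
    # Compare as version objects so that "1.1-35.3" and "1.1.35.3" are equal
    if (utils::packageVersion(pkg) == package_version(expected[[pkg]])) {
      "match"
    } else {
      "differs"
    }
  }, character(1))
  data.frame(
    package  = names(expected),
    expected = unname(expected),
    status   = unname(status),
    stringsAsFactors = FALSE
  )
}

# A subset of the versions listed above
expected <- c(data.table = "1.15.4", fixest = "0.12.0", lme4 = "1.1-35.3")
print(check_versions(expected))
```

A status of `differs` does not necessarily mean the replication will fail, only that the environment deviates from the one the code was last run on.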

## 2. Hardware Requirements

Approximate time needed to reproduce the analyses on a standard (2023) desktop machine is 3-4 hours.

The code was last run on a 16-core (Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz) laptop with 64 GB of RAM, running Ubuntu 22.04.3 LTS. Computation took about 3 hours. 


# Instructions to Replicators <a name="instructions"></a>

Extract the contents of the archive to a directory of your choice. Edit the preamble of `master.R` so that the working directory in `setwd()` points to the unzipped folder on your system, i.e. the directory that contains the folders `code/` and `data/`. To execute the code line by line, set the working directory in R and then run the scripts in the following order:

1. Run `code/run1_setup.R` to set up the working environment.
2. Run `code/run2_main.R` to replicate all figures and tables in the main text and appendix.
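
The two steps above can also be wrapped in a few lines of R. This is a sketch under stated assumptions: `run_replication()` is a hypothetical wrapper, not a function shipped with the package, and the path in the usage comment is a placeholder.

```r
# Hypothetical wrapper (not part of the package): sets the working directory to
# the unzipped folder and sources the two scripts in the documented order.
run_replication <- function(root) {
  stopifnot(
    dir.exists(file.path(root, "code")),
    dir.exists(file.path(root, "data"))
  )
  old <- setwd(root)
  on.exit(setwd(old), add = TRUE)            # restore the previous directory
  source(file.path("code", "run1_setup.R"))  # install missing packages
  source(file.path("code", "run2_main.R"))   # generate figures and tables
  invisible(TRUE)
}

# Usage (placeholder path; replace with the location on your system):
# run_replication("path/to/unzipped/folder")
```

Because `source()` evaluates in the global environment by default, objects created by the scripts remain available for inspection afterwards.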
    
# Mapping of Code to Figures and Tables <a name="mapping"></a>

| Figure/Table | Program | Lines |
| :--- | :--- | :--- |
| Figure 1 | `run2_main.R` | 45-90 |
| Figure 2 | N/A | N/A |
| Figure 3 | N/A | N/A |
| Figure 4 | N/A | N/A |
| Figure 5 | `run2_main.R` | 1144-1245 |
| Table 1 | `run2_main.R` | 661-813 |
| Figure A3.1 | `run2_main.R` | 161-237 |
| Figure A3.2 | `run2_main.R` | 242-298 |
| Figure A3.3 | `run2_main.R` | 300-332 |
| Figure A3.4 | `run2_main.R` | 337-384 |
| Figure A3.5 | `run2_main.R` | 388-462 |
| Figure A3.6 | `run2_main.R` | 549-586 |
| Figure A4.7 | `run2_main.R` | 820-877 |
| Figure A5.8 | `run2_main.R` | 969-1065 |
| Figure A5.9 | `run2_main.R` | 1068-1126 |
| Figure A6.10 | `run2_main.R` | 1259-1296 |
| Figure A6.11 | `run2_main.R` | 1247-1256 |
| Table A1.1 | `run2_main.R` | 93-107 |
| Table A1.2 | `run2_main.R` | 110-125 |
| Table A2.3 | `run2_main.R` | 128-146 |
| Table A2.4 | `run2_main.R` | 148-157 |
| Table A3.5 | `run2_main.R` | 467-545 |
| Table A3.6 | `run2_main.R` | 467-545 |
| Table A3.7 | `run2_main.R` | 590-634 |
| Table A3.8 | `run2_main.R` | 636-642 |
| Table A4.9 | `run2_main.R` | 661-748 |
| Table A4.10 | `run2_main.R` | 751-779 |
| Table A4.11 | `run2_main.R` | 781-786 |
| Table A5.12 | `run2_main.R` | 899-929 |
| Table A5.13 | `run2_main.R` | 931-934 |
| Table A5.14 | `run2_main.R` | 938-965 |
| Table A6.15 | `run2_main.R` | 1144-1182 |
| Table A6.16 | `run2_main.R` | 1299-1373 |
| Table A6.17 | `run2_main.R` | 1376-1405 |
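
To reproduce a single item from the table, one option is to evaluate only the corresponding line range. The helper below is a minimal sketch: `run_lines()` is hypothetical, and it assumes that any packages, functions, and data objects the chosen block depends on have already been loaded, which in practice may mean running earlier lines of `run2_main.R` first.

```r
# Hypothetical helper (not part of the package): evaluates a contiguous range
# of lines from a script in the given environment.
run_lines <- function(file, from, to, envir = globalenv()) {
  lines <- readLines(file, warn = FALSE)
  stopifnot(from >= 1, to <= length(lines), from <= to)
  eval(parse(text = lines[from:to], keep.source = FALSE), envir = envir)
}

# Example, using the Figure 1 range from the table above:
# run_lines("code/run2_main.R", 45, 90)
```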